Nesterov's accelerated gradient methods (AGM) have been successfully applied in many machine learning areas. However, their empirical performance on training max-margin models has been inferior to existing specialized solvers. In this paper, we first extend AGM to strongly convex and composite objective functions with Bregman-style prox-functions. Our unifying framework covers both the $\infty$-memory and 1-memory styles of AGM, tunes the Lipschitz constant adaptively, and bounds the duality gap. We then demonstrate various ways to apply this framework to a wide range of machine learning problems, with emphasis on their rates of convergence and on how to efficiently compute the gradient and optimize the models. Experimental results show that, with our extensions, AGM outperforms state-of-the-art solvers on max-margin models.